Historic Robots: Polly

Arthur Ed LeBouthillier

This article appeared in the November 1999 issue of The Robot Builder.

Polly was a robot built at MIT’s AI lab between 1992 and 1993 by Ian Horswill. It was a capable, autonomous, vision-guided robot that could navigate through its environment and interact with people. As its designer said, “Polly was designed to patrol the seventh floor of the laboratory, find visitors, and give them tours” [2]. It did this with only a single on-board processor. It gave hundreds of tours and could roll through the lab for upwards of 2 hours continuously before its batteries needed charging.

Polly’s hardware was not unusual. It was built on a commonly used, off-the-shelf B12 robot base made by RWI and used an off-the-shelf TMS320C30 DSP card as its main processor. The processor captured images from a camera, performed all of the necessary decision-making and generated the motor commands.

What made Polly so capable was its innovative vision system, which could differentiate between the floor and obstacles, identify people by their movement and use its knowledge of obstacles to navigate without a map.

Polly’s Vision System

Polly’s vision system was based on a simple idea: “…take an image, use some criterion to discard pixels that look like the floor, and avoid driving toward the remaining pixels” [1]. Polly’s camera generated images which were sent to the main processor through a frame grabber. The camera provided medium-resolution images which were sub-sampled down to 64 x 48 pixels with 15 gray levels.
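The sub-sampling step can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the article gives the output size and the number of gray levels, but not the source resolution or the reduction method, so the block averaging and the names subsample, out_w, out_h and levels are hypothetical.

```python
def subsample(image, out_w=64, out_h=48, levels=15):
    """Shrink a gray-scale image (a list of rows of 0-255 integers).

    Each output pixel is the mean of one block of input pixels,
    quantized to `levels` gray levels (0 .. levels - 1).
    Block averaging is an assumption made for this sketch.
    """
    in_h, in_w = len(image), len(image[0])
    small = []
    for oy in range(out_h):
        # Input rows covered by this output row (at least one row).
        y0 = oy * in_h // out_h
        y1 = max(y0 + 1, (oy + 1) * in_h // out_h)
        row = []
        for ox in range(out_w):
            # Input columns covered by this output column.
            x0 = ox * in_w // out_w
            x1 = max(x0 + 1, (ox + 1) * in_w // out_w)
            block = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            mean = sum(block) / len(block)
            row.append(min(levels - 1, int(mean * levels / 256)))
        small.append(row)
    return small


# Example: a uniformly white 128 x 96 frame becomes a 64 x 48 frame
# whose pixels all sit at the top gray level.
frame = [[255] * 128 for _ in range(96)]
small = subsample(frame)
```

Working on so few pixels is what keeps the later steps cheap: every per-frame operation touches only 64 x 48 = 3072 values.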

The key to making the whole system work lies in the ability to distinguish the floor from non-floor objects. In order to do this, Polly used two simplifying assumptions: the Ground Plane Constraint and the Background Texture Constraint. The Ground Plane Constraint assumed that all obstacles rest on the ground plane and are completely contained within the boundary region where they contact the floor. The Background Texture Constraint assumed that the environment is uniformly lighted and the ground has no texture that cannot be thresholded out easily. As long as the world reflected these constraints, Polly was able to identify the floor and, therefore, avoid all non-floor pixels.

In reality, the world doesn’t meet the two constraints and so Polly failed in its task at times. A table, for example, does not meet the Ground Plane Constraint because the tabletop does not contact the floor. Tables were a danger for Polly because it could only see the table’s legs but would hit the horizontal tabletop. The Background Texture Constraint was also not always met because floor reflections, shadows or floor stains would create apparent false obstacles.

Once Polly took a picture, it would threshold the image to separate floor pixels from non-floor pixels. It was then a simple process to scan vertically from bottom to top to identify the nearest object in the visual field. Knowing this, the program then classified obstacle characteristics by the following criteria:
open-left?        open-region?      open-right?
blind?            wall-ahead?       blocked?
light-floor?      wall-far-ahead?   left-turn?
dark-floor?       right-turn?       farthest-direction
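The threshold-and-scan step described above can be sketched in a few lines of Python. This is a minimal illustration, not Polly’s actual code: it assumes a plain brightness threshold with the floor lighter than the obstacles, and the names threshold, free_height, depth_profile and floor_min are hypothetical.

```python
def threshold(image, floor_min):
    """Return a binary image in which True marks a non-floor pixel.

    Assumes (for illustration only) that floor pixels are at least
    `floor_min` bright and everything darker is an obstacle.
    """
    return [[pixel < floor_min for pixel in row] for row in image]


def free_height(obstacle, x):
    """Count floor pixels in column x, scanning from the bottom row up,
    stopping at the first obstacle pixel."""
    height = 0
    for y in range(len(obstacle) - 1, -1, -1):
        if obstacle[y][x]:
            break
        height += 1
    return height


def depth_profile(obstacle):
    """Free-floor height for every column, left to right."""
    return [free_height(obstacle, x) for x in range(len(obstacle[0]))]


# A tiny 4 x 4 example; row 0 is the top of the image.
image = [
    [0, 9, 9, 9],
    [9, 9, 0, 9],
    [9, 9, 9, 9],
    [9, 0, 9, 9],
]
profile = depth_profile(threshold(image, floor_min=5))
```

Under the Ground Plane Constraint, a column whose first obstacle pixel sits higher in the image has more open floor in front of the robot, so the profile acts as a rough per-direction distance estimate.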

Using these criteria, the robot could then test a number of conditions to generate the proper motor commands. Polly recognized people because they moved in its field of view from frame to frame; otherwise, it treated them as obstacles. It recognized corridors by looking for a boundary between the floor and obstacles which extended towards the horizon. In figure 1, an image has been thresholded so that obstacles are white and the floor is black.

Polly then scanned from bottom to top to identify the closest objects to the left, center and right. Using this knowledge, it could generate motor commands to turn left, turn right or turn around. Polly performed these routines 15 times per second.
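One way to turn the left, center and right distances into a motor command is sketched below. The decision rule, the "forward" command, the min_clearance value and the function name steer are assumptions made for illustration; the article does not give Polly’s actual conditions.

```python
def steer(profile, min_clearance=4):
    """Pick a motor command from a per-column free-floor profile.

    `profile` holds, for each image column, the number of floor pixels
    seen before the first obstacle (larger means farther away).
    It must contain at least three columns.
    """
    third = len(profile) // 3
    left = min(profile[:third])
    center = min(profile[third:len(profile) - third])
    right = min(profile[len(profile) - third:])

    if center >= min_clearance:
        return "forward"          # the way ahead is clear
    if max(left, right) < min_clearance:
        return "turn-around"      # boxed in on every side
    # Otherwise turn toward whichever side has more open floor.
    return "turn-left" if left >= right else "turn-right"


command = steer([10, 10, 1, 1, 2, 2])
```

Taking the minimum over each third is a cautious choice: a single nearby obstacle pixel anywhere in a region is enough to mark that region as blocked.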

Place Recognition

Another feature implemented in Polly was place recognition. Polly was able to recognize particular corridor features such as intersections and corners so that it could plan visits to various places and key its generated speech to them. It did this by keeping a store of low-resolution images of features such as an intersection of two hallways. The most recent image frame was constantly compared against several different corridor images, and a close match, coupled with the robot’s knowledge of its location and other general features, indicated that the robot was at a particular place.
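The image comparison can be sketched as a nearest-template search. This is only an illustration: the sum of absolute differences, the max_distance cutoff and the names image_distance and recognize_place are assumptions, and the sketch leaves out the location knowledge that Polly combined with the match.

```python
def image_distance(a, b):
    """Sum of absolute pixel differences between two same-sized images."""
    return sum(abs(pa - pb)
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


def recognize_place(frame, templates, max_distance):
    """Return the name of the closest stored image, or None when even
    the best match differs from the frame by more than `max_distance`."""
    best_name, best_dist = None, None
    for name, template in templates.items():
        dist = image_distance(frame, template)
        if best_dist is None or dist < best_dist:
            best_name, best_dist = name, dist
    if best_dist is not None and best_dist <= max_distance:
        return best_name
    return None


# Tiny 2 x 2 stand-ins for the stored low-resolution corridor images.
templates = {
    "corner": [[0, 0], [9, 9]],
    "intersection": [[9, 9], [9, 9]],
}
place = recognize_place([[1, 0], [9, 8]], templates, max_distance=5)
```

Because the stored images are so small, comparing the current frame against several of them on every cycle costs very little.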

Corridor Following

Polly also had a corridor recognition algorithm which identified corridors by searching for converging lines that projected to the horizon. Knowing this information, Polly could orient itself in the corridor and ensure proper navigation down the center.
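The geometry behind this can be sketched as follows: if the left and right floor boundaries are treated as two straight lines in the image, their intersection approximates the point on the horizon where the corridor converges, and its horizontal offset from the image center indicates how far the robot is turned away from the corridor axis. The article does not say how Polly found or fitted these lines, so the sketch assumes each boundary is already given as two image points; line_through, vanishing_point and heading_offset are hypothetical names.

```python
def line_through(p, q):
    """Return (a, b, c) such that a*x + b*y = c for the line through p and q."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    return a, b, a * x1 + b * y1


def vanishing_point(left_edge, right_edge):
    """Intersect two boundary lines, each given as a pair of (x, y) points.

    Returns None when the lines are parallel in the image.
    """
    a1, b1, c1 = line_through(*left_edge)
    a2, b2, c2 = line_through(*right_edge)
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    # Cramer's rule for the 2 x 2 linear system.
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)


def heading_offset(point, image_width=64):
    """Horizontal distance of the vanishing point from the image center.
    Positive means the corridor converges to the right of center."""
    return point[0] - (image_width - 1) / 2


# Symmetric boundaries in a 64 x 48 image: the robot faces straight down the hall.
left = ((0, 47), (16, 23))
right = ((63, 47), (47, 23))
vp = vanishing_point(left, right)
offset = heading_offset(vp)
```

A controller could steer to drive this offset toward zero, which keeps the robot pointed down the corridor.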

Motor Control Generation

Polly’s motion was controlled by three separate systems. The Corridor Follower caused Polly to continuously move forward and stay in the center of a hallway. The Obstacle Avoider kept the robot a certain distance from obstacles and would even cause Polly to back up if someone got too close. A Turn Controller took over in order to allow controlled turns at key places (i.e., hallway corners and intersections). An overall goal control system mediated between these systems in order to make the robot go from one place to another, based on either recognition of the place or knowledge of its approximate coordinates.
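A simple way to picture the mediation between the three systems is a fixed-priority arbiter, sketched below. The article does not describe how Polly’s goal system actually chose among them, so the priority order, the state fields and the numeric values here are all assumptions made for illustration.

```python
def obstacle_avoider(state):
    """Back up when something is closer than the allowed clearance."""
    if state["clearance"] < state["min_clearance"]:
        return {"speed": -0.2, "turn": 0.0}
    return None


def turn_controller(state):
    """Take over while a turn is pending at a corner or intersection."""
    if state["pending_turn"] is not None:
        return {"speed": 0.3, "turn": state["pending_turn"]}
    return None


def corridor_follower(state):
    """Default behavior: move forward and steer back toward the center."""
    return {"speed": 1.0, "turn": -0.5 * state["center_offset"]}


def arbitrate(state):
    """Return the command of the first controller that wants control."""
    for controller in (obstacle_avoider, turn_controller, corridor_follower):
        command = controller(state)
        if command is not None:
            return command


state = {
    "clearance": 10,        # free floor ahead, in image rows
    "min_clearance": 4,
    "pending_turn": None,   # or a turn rate requested by the goal system
    "center_offset": 0.2,   # positive when the robot is right of center
}
command = arbitrate(state)
```

In this sketch the goal system would influence the robot only by setting pending_turn when a recognized place calls for a turn; the other two behaviors run unchanged.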

Performance

Polly ran at upwards of 1 meter per second and was able to operate in the corridors of the MIT AI building for hours at a time. During its operational period, it conducted hundreds of tours. Variations of this navigation scheme have been used on other robots and have demonstrated long-duration, capable, autonomous vision-based navigation.

Summary

Polly was a capable vision-guided robot which operated in real time. Because of the low image resolution, the processing system was able to constantly take an image, determine obstacles and generate motor commands rapidly. The techniques developed for Polly have applications for hobbyist robots, since little processing power is needed to create a robot that visually avoids obstacles in real time.


References
[1] Horswill, Ian. “Visual Collision Avoidance by Segmentation.” May 26, 1994.
[2] Horswill, Ian. “The design of the Polly system” (draft). April 30, 1996.